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Abstract 

In this paper, we study a class of approximation problems, appearing in data approximation 
and signal processing. The approximations are constructed as combinations of polynomial splines 
(piecewise polynomials), whose parameters are subject to optimisation, and so called prototype 
functions, whose choice is based on the application, rather than optimisation. The corresponding 
optimisation problems can be formulated as Linear Least Squares Problems (LLSPs). If the system 
matrix is non-singular, then the corresponding problem can be solved using the normal equations 
method, while for singular cases slower (but more robust) methods have to be used. In this paper 
we develop sufficient conditions for non-singularity. These conditions can be verified much faster 
than the direct singularity verification of the system matrices. Therefore, the algorithm efficiency 
can be improved by choosing a better suited method for solving the corresponding LLSPs. 

Key words: convex optimisation, signal approximation by spline functions and linear least squares 
problems. 
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1 Introduction 

In this paper, we consider two optimisation problems (Model 1 and Model 2), frequently appearing 
in approximation and signal processing. The signal approximations are constructed as products of 
two functions. In the corresponding function products, one of the functions is a polynomial spline 
(piecewise polynomial) and another function is a prototype (also called basis) function, defined by the 
application. Common examples of prototype functions are sine and cosine functions. In Model 1 the 
wave is oscillating around “zero level”, while Model 2 admits a vertical shift. In the case when the 
vertical shift is required (Model 2), we construct it as another polynomial spline. 

The choice of polynomial splines is due to the fact that these functions combine the simplicity of 
polynomials and additional approximation flexibility, which can be achieved by switching from one 
polynomial to another. Therefore, on the one hand, the corresponding optimisation problems can be 
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solved efficiently and, on the other hand, the approximation inaccuracy (evaluated as the sum of the 
corresponding deviation squares) is low. 

It is shown in this paper that Model 1 and Model 2 can be formulated as Linear Least Squares 
Problems (LLSPs). There are a variety of methods to solve such problems [2j|5l|6]. If it is known that 
the corresponding system matrices are non-singular, then the most popular approach for solving the 
corresponding LLSPs is based on the normal equations method, since this method is very efficient 
(fast and accurate) when working with non-singular matrices. If the corresponding matrices are 
singular, one needs to apply more robust methods, for example, QR decomposition or Singular Value 
Decomposition (SVD). These methods are much more computationally expensive. 

In this paper we provide sufficient conditions for non-singularity (both models). The obtained condi¬ 
tions are easier to check than the direct singularity verification of the corresponding matrices. There¬ 
fore, the algorithm efficiency can be enhanced by choosing a better suited approach for solving the 
corresponding LLSPs. 

In this paper, we use truncated power basis function to define splines. Another way to construct basis 
function is through B-splines. B-splines have several computational advantages when running numer¬ 
ical experiments (have smallest possible support, see [9]). However, truncated power functions are 
very common when theoretical properties of the models are the research targets (see, for example i)- 

The paper is organised as follows. In section [2] we formulate the optimisation problems (both models). 
Then, in section[3]we develop sufficient conditions for non-singularity. In section[I]we provide a detailed 
example of how our conditions can be applied. Finally, in section [5] we summarise the obtained results 
and indicate further research directions. 


2 Approximation models 

In this section, we formally introduce polynomial splines and explain why these functions are used in 
our models. Then we formulate the models as mathematical programming problems and demonstrate 
that this problems are LLSPs. 

2.1 Polynomial splines 

Polynomial splines (piecewise polynomials) combine the simplicity of polynomials and the flexibility 
that enables them to change abruptly at the points of switching from one polynomial to another (spline 
knots). These special properties are essential for good quality approximation, since, on the one hand, 
the corresponding optimisation problems are relatively inexpensive to solve and, on the other hand, the 
obtained approximations are accurate enough for reflecting the essential characteristics of the original 
signal (raw data). Therefore, polynomial splines are very commonly used in approximation [71191111], 

There are many ways to construct polynomial splines. One possibility is to do it through truncated 
power functions [9]: 


5 m (x,0 ,t) = x’o + ^2,xijt j + - Oi- 1 )+) 

1=1 1=2 1=1 
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where m is the degree, 9 = (6b,..., 0 n -i) are the knots, xq, xn,..., x nm are the spline parameters and 


(t - 9i- 1 )+ = max{0, (t - 9 t - i)} = 


2 


t - 9i_i , if t> fy_i 
0 , if t < 9i -1 


is the truncated power function. Note that this presentation implies that the polynomial splines are 
continuous. In general, polynomial splines can be discontinuous at their knots [0]. In this paper, 
however, we only consider continuous polynomial splines. 

The spline knots can be free or fixed. If the knots are free then they are considered as additional 
variables in the corresponding optimisation problem and thus x = (xq, xu, ..., x nm , 9) (see next section 
for details). In this case, the corresponding optimisation problem becomes more complex [3lf9lllf)lll2lil3] . 
In particular, it becomes non-convex. Generally, it is much easier to solve a higher dimension fixed 
knots problem (with a considerably larger number of intervals) than a free knots one. 

2.2 Optimisation problems 

In this section we formulate our models as mathematical programming problems. We consider two 
types of models. Model 1 corresponds to the case when the wave is oscillating around “zero level”, 
while Model 2 enables a vertical shift in a form of a polynomial spline. Similar problems have been 
considered in |151I16| . where it appeared that Model 2, in general, is more computationally expensive, 
but its approximation accuracy is significantly higher. 

Consider a signal segment y = (yi, • • • , Vn) € M^, where y*, i = 1,... ,N are evaluated at the time 
moments fj, i = 1 This signal is to be approximated by a function /(f) from one of the 

following two models. 

Model 1 : /(f) = A(M,t)g(t ) (2.2) 

and 

Model 2: f(t) = A 1 (M,t)g(t) + A 2 (L,t), (2.3) 

where A(M,t), Ai(M,t) and A 2 (L,t) are polynomial splines with fixed knots, A, A\ and A 2 are the 
corresponding spline parameters, and y(f) is a prototype function (e.g., sine, cosine). In most cases, the 
choice of the prototype functions is application driven (specified by application experts, e.g., engineer). 

These two approximation problems can be formulated as mathematical programming problems. 

N 

Model 1 : min VVy* - A(x,ti)g(ti )) 2 (2.4) 

X Z - ' 

i =1 


and 


N 

Model 2 : min Y^(y* - A 1 (xi,ti)g{U) 

i= 1 


A 2 {yi 2 ,ti)) 2 


In the following two sections we study these models in depth. 


(2.5) 


2.2.1 Linear Least Squares Optimisation Model 1 

Assume that the spline degree is rri, the number of subintervals is n and the corresponding knots are 

e 0 = h < 01 < e 2 < ■ ■ ■ < 9 n -\ <9 n = t N . 

Model 1 is an LLSP since it can be rewritten as follows: 

min H-Bx — y|||, (2.6) 

X 
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where y 

= (yi, 

...,yiv) r € 

ra* yi, 


are the recorded signals at ij, 

i = and 
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where 


Pij 

= max 

o 

cT + ' 

1 

<■ 0 . 

= 1,... ,n - 1, 

i = 1,... ,1V, 




and a* = g(xi), i = 1,..., N. There exist various methods for solving an LLSP. Most of them are 
based on the normal equations method, QR decomposition and SVD (see ®®m f° r details). If 
B g I^Vx(mn+l) is a full-rank matrix then, the corresponding LLSP can be solved through the system 
of normal equations: 

(R t R)x = B T y , (2.8) 

where y = (yi,... ,vn) T G M n is a signal segment recorded at N distinct consecutive time moments. 
This method is much faster than QR decomposition or SVD but not so accurate.when matrix B is 
singular. Therefore, it is essential to develop a singularity testing procedure to choose a suitable 
method for solving LLSPs. 


2.2.2 Linear Least Squares Optimisation Model 2 


In this model, we assume that the wave (signal) is shifted vertically by a spline function. Similar to 
Model 1, the spline degree is m , the number of subintervals is n and the corresponding knots are 

o 0 = h < e 1 < e 2 < ■ ■ ■ < 0 n —\ <e n = t N . 


The corresponding optimisation problem is an LLSP, formulated as follows: 

min H-Bx — y|||, 

X 


(2.9) 


where x = [xi,X 2 ], y £ 
B can be constructed as 


is the original signal (see (12.51) for details) and B g g^>'(2mn+2) i Matrix 

£>Nx(2mn+2) _ ^Nx(mn+ 1) ^iVx(mn+1)j 


where B 


Nx(mn+1) 

1 


and B 


Nx(mn+1) 

2 


are as follows: 
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and 


B 2 
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( 2 . 11 ) 


1 t N ... t? 0 m 


PN1 ••• 0Nn-l ... 0Nn-l 


where 

0 ij = max{0, U - 0j}, j = 1,..., n - 1, i = l,...,N 


4 








and oti = g(U), i = 1,..., N. Note that B\ is similar to B described in (12.71) . In addition, the rows 
of Bi are obtained from the rows of B -2 through multiplying them by a* (see (I2.1UI) and (12.111) for 
details). 

One can rearrange the columns of matrix B in such a way that the updated matrix contains zero- 
blocks in the top-right corner (matrix M). This can be achieved by splitting the columns of B\ and 
i ?2 into sub-block, where the first sub-block contains m + 1 columns and all the following sub-blocks 
contain m columns (sub-blocks B 1,1 ,..., B l,n and B 2,1 ,..., B 2,n ). Then 

M = [B u ,B 21 ,B 21 ,B 22 ,...,B ln ,B 2n ]. 

Note that rank(R) = rank(M). 

According to the numerical experiments m, B is a rank-deficient matrix and therefore B T B is 
singular. As a consequence, the normal equations method is not efficient and therefore, more robust 
methods such as QR decomposition or SVD are required to solve this optimisation problem [HBjQ]. 
Note that these methods are substantially more expensive than normal equations. 


3 Singularity study 


Let us point out that the matrices B, B\ and B 2 
M such that 

' An 0 
A21 A22 


M = 


can be expresses as a block lower triangular matrix 


0 0 
0 ••• 0 

A33 0 0 


(3.1) 


A n i A n 2 A n 3 • A nn \ 

where Aj\, j = 1,..., n has N/n rows and m + 1 columns, Ajk, j,k = 2,..., n, j > k has N/n rows 
and m columns and the top-right corner {Ajk , for k > j, j, k = 1,..., n) contains zeros since 


max{0, ti — Oi_i} = 0, for all f* < #;_i, l = 2,..., n. 


In some cases, one or more of the time moments can coincide with the corresponding spline knots. To 
avoid possible ambiguity, we assign the time moments into the subintervals according to the following 
subdivision: 

[00,0i], {Ok-iM, k = 2,.. .n. (3.2) 

Therefore, the first subinterval includes both borders while the other subintervals only include the 
right border. 

Note that all the diagonal blocks of (13.11) are rectangular matrices {An, i = 1 ,,n). The number of 
columns in An is m + 1, while the number of columns in An, i = 2,..., n is m. The number of rows 
in An, i = 1,..., n coincided with the number of time moments assigned to the i— th subinterval. 


3.1 Model 1: singularity study 

In this section, we develop a sufficient condition for non-singularity of Model 1 (oscillation around 
“zero”). If this condition is satisfied, we can guarantee that the corresponding matrices are non¬ 
singular and therefore, one can apply the normal equations method that is well-known to be fast and 
efficient for such problems. The following theorem holds. 
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Theorem 3.1 (sufficient non-singularty condition). Suppose that the spline degree is rn, the number 
of subintervals is n and the corresponding spline knots are 

Oo = h < 9\ < ■ ■ ■ < 0 n -1 < 0 n = tx. 

Matrix B is non-singular when the following inequalities satisfy 

N\ — Z\ > m + 1 and Nk — Zk > m, k = 2,..., n, (3-3) 

where Nk, k = 1 ,,n is the total number of recordings for k—th subinterval established in (E2D and 
Zk, k = 1,... ,n is the number of time moments ti in the k—th subinterval, such that g(tf) = 0. 


Proof: Our proof is based on two important facts. 


1. For a rectangular matrix, changing the order of the rows or multiplying a row by non-zero 
constants (elementary row operations) do not change the rank of the matrix. 


2. The determinant of a square Vandermonde matrix 


can be expressed as 


V = 


( 1 
1 
1 


V 2 

^3 


A 

A 

A 


...x\ n 
■ ■■ x 2 m 
...v 3 - 


\1 Xm+1 An+x ■■■A+lJ 


det(V) = (Xj-Xi), 

l<i<j<m 


(3.4) 


and therefore, it can not be zero if all X t , i = 1,..., m are distinct. 


Note that the rows of the diagonal blocks An of m are Vandermonde matrix rows that are multiplied 
by a constant a 3 , j = 1,..., N\, where Ni is the number of time moments assigned to the first interval. 
Therefore, if N\ — Z\ > rn + 1 then, it is possible to extract (m + 1) linearly independent rows from 
the first N\ rows of B. 

For the second interval the situation is similar, but each row is multiplied by 

a.j x (tj — 9\), j = Ni + 1,..., Ni + N 2 - 

Since for the second interval none of the time moments can coincide with 0\ one can conclude that 

(tj ~ 0\) >0, j = N\ + 1,..., Ni + N 2 . 

Therefore, if N 2 — Z 2 > m then it is possible to extract m linearly independent rows from the block 
of the rows N\ + 1,..., N\ + N 2 of B. These rows will be also linearly independent with the m + 1 
rows extracted from the first JVi rows of B. 

By continuing the process, we finally have mn + 1 linearly independent rows and therefore, matrix B 
is indeed full-rank. □ 

Generally, it is not always easy to estimate Zk, k = 1,... ,n. In Section U] we give an example where 
g(t) is a periodical (sine) function and therefore, the corresponding Zk, k = 1,..., n can be estimated 
through the corresponding frequencies. 
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3.2 Model 2 


Consider An and divide it into two parts: any m + 1 rows form the bottom part and the remaining 
rows form the top part. Therefore, An contains four sub-blocks Bm, Bm, -B 211 and #221 • The last 
index indicates that we are working with the block An (first interval). Therefore, 

An = ( ® 1U ® 121 V (3.5) 

V -D211 -D221 J 

B 221 is a full-rank block in An (Vandermonde-type matrix) and if we apply equivalent row operations 
one can obtain zeros everywhere in sub-block Bm using the last m + 1 rows of An- Therefore, there 
exists a unique set of Ah, j = 1,..., m + 1, i = 1,..., m + 1, such that 

ra+1 

-®121 = ^*7^221 > (3-6) 

2—1 

where 2 ^ 2 i is the j—th row of B 121 and B 221 is the i —th row of H 221 . 

The rows of B m and B 211 denote by B{ u and B 211 for j = 1,... ,m + 1 respectively. Then, the rows 
B\n, i = 1, • • •,m + 1 of the updated block Bni (obtained from B m by equivalent row operations) 
are as follows: 

ra+1 

#111 = B ln -E 4^211 • (3.7) 

i= 1 

Therefore, 

-Bill = Bin — VfB2n, (3.8) 

where Ai is an (r7i.+ 1) x (m + 1) matrix with the {ij}-th element equals Ah. Note that An is full-rank 
if and only if 

Bin = Bin — Afi?2ii , (3.9) 

is full-rank. Similar reasoning is applied to remaining intervals. Taking into account that 

• the dimension of each block An, i = 2,..., n is Ni x m , where A 7 ,; is the number of time moments 
assigned to the i —th interval and 

• the left border point is not included in any of each interval. 

In general, for 1 < i < n, (13.91) can be rewritten as follows 

B\u = Bni — AJB 2 U, (3.10) 

where Aj is an m x m matrix with the {jk }-th element equals A* fc . Hence, the following theorem holds 
(sufficient condition for non-singularity of Model 2). 

Theorem 3.2 Suppose that the spline degree is m, the number of subintervals is n and the corre¬ 
sponding spline knots are 

= ti < d\ < ■ ■ ■ < 9 n -1 < 6 n = tN ■ 

It is possible to construct the bottom blocks in each matrix An such that the corresponding matrices in 
h3.1(A ) are full-rank then, B is full-rank too. 

Note that if g(t) is constant then, the condition of Theorem 13.21 are not satisfied. Moreover, it indicates 
that the corresponding matrix in Model 2 is rank-deficit since there exists a column that is obtained 
by multiplication of another column by a = g(U), i = 1,..., n. 
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4 Application to signal processing 


In this section, we give an example of how our conditions can be applied in a particular problem in 
signal processing. In addition, we propose an algorithm for signal approximation where the choice of 
optimisation techniques is based on the corresponding singularity study. 

4.1 Model 1 

Au EEG (electroencephalogram, also known as brain wave) signal is modeled as a sine wave 

Wi = S m (x.,0,t)sm(ut + T), (4.1) 

where S m is the spline function defined in (12.11) whose 6 = (6 1 ,..., 9 n - i) are equidistant therefore, for 
each combination of uj and r the corresponding optimisation problem is 

N 

minVVj/i - S m ( x , 0,ti)sm(uti + r)) 2 3 . (4.2) 

X ZJ 

i =1 

This is an LLSP. To achieve the best combination of uj and r we run a double loop over the defined 
intervals for uj and r, keeping the optimal combinations, that is combinations with the lowest objective 
function values (see section 14.31 for details). Optimisation problem (14.21) is reformulated as 

N 

minV'Vyj — Mx r ) 2 , or min IlMx — , (4.3) 

i —1 

where yt E are the recorded signals at E , x E M mn+1 and M is a matrix with N rows and 
mn + 1 columns of the form S' m (x, 9, ti ) sin (uti + r). If M E ]R Arxmri + 1 is a full-rank matrix then, this 
LLSP can be solved through the systems of normal equations. 

In this application, there are natural restrictions on u and n that are considered in [IB], 

• Frequency uj is a parameter that is normally assigned by a manual scorer (medical doctor). 
Therefore, the value for frequency is bounded from above (by 16 Hz) and restricted to integer 
(due to human scorer’s perception limitations). 

• The duration of the events is between 0.5 and 3 seconds (these events are called K-complexes) [16] . 
It is not reasonable to consider any interval shorter than 1 second. Since the duration of the 
original signal is 10 seconds therefore, the number of subintervals n can not exceed 10. 

We need to show that there are mn + 1 linearly independent rows and therefore, B is a full-rank 
matrix (see Theorem l3.1D . There are three main reasons why there may be fewer than mn + 1 linearly 
independent rows. 

1. Some of sin (uti + t) are zero then, in this case, zero rows will appear. This can happen when 
the frequency is high. 

2. There are too many intervals n, since in this case mn + 1 is large. 

3. The degree m of the corresponding polynomial is high so, in this case, mn + 1 is large. There is 
no (application based) upper bound for m therefore, the choice of m is based on the complexity 
of the signal and computer resources. 


Before we start proving that M is a full-rank matrix, we have to estimate the number of constants 
sin(a;fj + r) being zero (estimation of Z^, k = 1,..., n). It can happen no more than 2 ujD + 1 times 
for a D seconds duration of an EEG. Suppose that the knots 6 1 ,... ,9 n (switches from one polynomial 
to another) are equidistant. According to the reported experiments in [ IB ] N = 1000, n = 5, m = 4 
and uj did not exceed 16 Hz. The duration of each signal segment is 10 seconds therefore, the duration 
of each subinterval is 

D/n = 2 seconds . 


Then, 


Nk = 200, Zk < 2uo x D/n + 1 = 65 , k = 1,..., n , 


and therefore, 


— Zk = 200 — 65 > m = 4, k = 1,..., n. 


Hence, due to the Theorem 13.11 matrix M is non-singular and therefore, the normal equations method 
can be applied. 


4.2 Model 2 

Similar to Model 1, the knots are equidistant, N = 1000, n = 5, m = 4 and uj did not exceed 16 Hz. 
An EEG signal is modeled as a sine wave that is shifted vertically by a spline function as 

W 2 = S m (xi, 6 , t) sin(u4 + r) + S m (x 2 ,6 , t ), (4.4) 

therefore, the corresponding optimisation problem is 

N 

min V ~](yi - S m (xi , 0 , U) sin(wfj + r) - S m (x 2 ,6 , U)) 2 , (4.5) 

X Z ' 

i=l 

where yi € are the recorded signal at fj for i = 1,2,..., N and x = [xi;x 2 ]. The dimension of this 
problem is 2 mn + 2. The optimization problem (14.51) can be rewritten as 

N 

min (yi — Bx r ) 2 , or min ||Hx — yAl'n , (4.6) 

i=l 

where B £ ]g-' v x( 2mri + 2 ). Matrix B in the optimisation problem (14.51) is 

^ATx(2mn+2) _ j^A^xmn+1 ^./Vxmn+lj 

where xmn+1 and H 2 Vxmn+1 are detailed in (lATOl) and (12411) . 

Numerical experiments with Model 2 (see E5I) indicate that B is a rank-deficient matrix. This can be 
anticipated, since the conditions of Theorem 13.21 are not satisfied. In this case, instead of checking the 
rank of an N x (2 mn + 2) matrix one needs to check the rank of several N/nx {mn + 1) matrices. Since 
the number of such matrices can be large, in most cases it is more efficient to check the singularity 
of the original matrix. Numerical experiments in m show that the corresponding matrices are not 
full-rank and therefore, an SVD-based method is applied instead of normal equations. 
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4.3 Algorithm implementation 

In this section, we present an algorithm for solving (14.21) and (14.51) . In most practical problems, u 
and r are not known in advance and therefore, there should be a procedure for choosing them. One 
way is to consider them as additional variables and optimise them. This approach is not very efficient 
since the corresponding optimisation problems become non-convex and can not be solved fast and 
accurately [8]. Therefore, we can assign exact values from defined intervals of uj and r that form a 
fine grid (using double loops) instead of optimising them directly. Then, we solve the corresponding 
LLSPs and keep the best obtained results m- 

The following algorithm can be used to solve a sequence of LLSPs. In this algorithm, ujq and uj are 
the initial and final value for u. Similarly, to and tj are the initial and final values for r. 

Algorithm 1: Signal approximation through LLSPs 

1: Specify the initial and final values for the frequency (cuo and Uf ) and shift (to and t/) 

2 : for u = ujo : Uf do 
3: for T = Tq : Tf do 

4: Solve the corresponding optimisation problem (LLSP) with fixed cj and t; and record the 

minimal value of the objective function. 

5: end for 

6: end for 

In this algorithm, the normal equations method can be used for solving LLSPs with non-singular 
matrix while QR decomposition or SVD should be used for singular cases. 


5 Conclusions and further research directions 

Most linear least square problems can be solved using the system of normal equations if the corre¬ 
sponding matrix is non-singular otherwise one needs to apply a more robust (and time-consuming) 
approach (e.g., QR decomposition and SVD). We consider two types of linear least squares problems 
and develop a procedure which enables us to identify when the corresponding matrix is non-singular. 
Basing on the outcomes of this procedure, one can choose a more efficient method for solving the 
corresponding linear least squares problems. 

Currently, we consider three main future research directions. 

1. The development of necessary and sufficient conditions for non-singularity verification. 

2. The development of more flexible models where the spline of vertical shift does not have the 
same degree and knots location as the main spline (multiplied by prototype functions). 

3. The extension of the results to the case when other types of functions (not necessary polynomial 
splines) are used to construct the corresponding approximations. 
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